Docket No.: C38435/109700CON

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re Application of:             )
Akira ASAKURA et al.              )    Examiner: M. Walicka
Serial No.: 09/470,667            )    Art Unit: 1652
Filed: December 22, 1999          )
For: NOVEL ALCOHOL/ALDEHYDE       )
     DEHYDROGENASES               )

Commissioner for Patents
Washington, DC 20231

SECOND DECLARATION OF DR. MASAKO SHINJOH UNDER 37 C.F.R. § 1.132

Sir:

I, Masako Shinjoh, a citizen and resident of Japan, hereby declare as
follows:

1. I am employed by Nippon Roche Research Center of Nippon Roche
K.K., Kajiwara 200, Kamakura-shi, Kanagawa-ken 247-8530, Japan
(hereafter "NRKK"). I currently hold the position of genetic engineer
at NRKK. A copy of my curriculum vitae is attached as Exhibit 1.

2. I am a coinventor of U.S. patent application No. 09/470,667 (the
'667 application). The '667 application is summarized in more detail
in the FIRST DECLARATION OF DR. MASAKO SHINJOH UNDER
37 C.F.R. § 1.132 ("First Declaration") filed concurrently herewith.

3. As described in the First Declaration, after reviewing the Sequence
Listing filed with the '667 application, how the nucleotide and amino
acid sequences that make up the Sequence Listing were
incorporated into the '667 application, and the original nucleotide
printouts from the sequencing machine used to read the
experimentally derived sequences, I have come to the conclusion
that SEQ ID NOs:1, 3, and 7 each contain a single base (SEQ ID
NOs:1 and 3) or a single amino acid (SEQ ID NO:7) error that arose
through typing errors.

4. After the '667 application was filed, I found discrepancies in the
nucleotide and amino acid sequences identified in the '667
application as SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:7,
respectively, when compared to the computer printouts generated by
the nucleotide sequencing machine used to read the nucleotide
sequences that ultimately became SEQ ID NOs:1 and 3 in the '667
application. As set forth in more detail below, I believe that each of
these discrepancies was the result of a typing mistake made when I
prepared the sequence listing data for an internal Research Report.

5. The original sequence data underlying each of the sequences
disclosed in the Sequence Listing of the '667 application were
generated by a nucleotide sequencing machine, and could not be
converted into an electronic file for manipulation in an electronic
medium (e.g., a word processor). Accordingly, I manually typed the
sequences ultimately disclosed in the '667 application into an
electronic format using the original sequence data generated by the
nucleotide sequencing machine. It is my belief that, when the
original sequence data were retyped into an electronic format, a
single base in each of SEQ ID NOs:1 and 3 was entered in error,
and that because of the error in SEQ ID NO:3, its deduced amino
acid sequence (SEQ ID NO:7) also contained a single amino acid
error. The manually re-typed sequences, including the
unrecognized typographical mistakes, were then incorporated into
the foreign priority application (EP 96115001 filed September 19,
1996), which became the basis for the '667 application including the
Sequence Listing contained therein. (Exhibit 2).

6. A copy of the original printout from the nucleotide sequencing
machine of the open reading frame of Enzyme A including the
nucleotide sequence (which became SEQ ID NO:1 in the '667
application) and its deduced amino acid sequence (which became
SEQ ID NO:5 in the '667 application) is attached as Exhibit 3. I
have compared the nucleotide and deduced amino acid sequences
from the original printout with the sequences disclosed as SEQ ID
NOs:1 and 5 in the '667 application, and have found that the
nucleotide at position 852 of SEQ ID NO:1 is a "G" whereas the
corresponding nucleotide in the original printout is a "C." It is my
belief that the correct nucleotide at position 852 is "C," not "G" as
recited in SEQ ID NO:1.

7. Because of the redundancy of the genetic code, when SEQ ID NO:1
was translated, the deduced amino acid encoded by the codon
containing the nucleotide at position 852 did not change compared
to the deduced amino acid sequence generated by the nucleotide
sequencer as set forth in the original printout. Thus, the two
deduced amino acid sequences are identical.
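
The silent nature of this change can be illustrated with a short sketch. It assumes that the reading frame starts at nucleotide 1, so that position 852 is the third base of the codon spanning nucleotides 850-852; that codon reads GCG in SEQ ID NO:1 as filed and GCC in the resequenced fragment. The two-entry table is an excerpt of the standard genetic code, and the variable names are illustrative only.

```python
# Excerpt of the standard genetic code: only the two codons needed here.
CODON_TABLE = {"GCG": "Ala", "GCC": "Ala"}

as_filed = "GCG"     # nt 850-852 of SEQ ID NO:1 as filed ("G" at 852)
resequenced = "GCC"  # nt 850-852 in the resequenced fragment ("C" at 852)

# A substitution is silent when both codons specify the same residue.
same_residue = CODON_TABLE[as_filed] == CODON_TABLE[resequenced]
print(CODON_TABLE[as_filed], CODON_TABLE[resequenced], same_residue)
```

Both codons specify alanine, which is consistent with the statement that the deduced amino acid sequence is unchanged.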

8. A copy of the original printout from the nucleotide sequencing
machine of the open reading frame of Enzyme A" including the
nucleotide sequence (which became SEQ ID NO:3 in the '667
application) and its deduced amino acid sequence (which became
SEQ ID NO:7 in the '667 application) is attached as Exhibit 4. I
have compared the nucleotide and deduced amino acid sequences
from the original printout with the sequences disclosed as SEQ ID
NOs:3 and 7 in the '667 application and have found that the
nucleotide at position 644 of SEQ ID NO:3 is an "A" whereas the
corresponding nucleotide in the original printout is a "C." It is my
belief that the correct nucleotide at position 644 is "C," not "A" as
recited in SEQ ID NO:3.

9. The substitution of "A" for "C" at position 644 in SEQ ID NO:3 also
led to the translation of a different amino acid ("Asn" was translated
instead of "Thr" at position 192) from the codon containing the error
at nucleotide position 644. It is my belief that the correct amino acid
at position 192 of SEQ ID NO:7 is "Thr," not "Asn" as currently
recited.
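
The relationship between nucleotide position 644 and amino acid position 192 can be worked through as follows. This is a sketch under two assumptions that are not stated in this paragraph: the coding sequence starts at nucleotide 1, and mature-peptide numbering begins after a 23-residue signal peptide (the length shown for SEQ ID NO:5 in the Sequence Listing, assumed here to apply to SEQ ID NO:7 as well). The function names are hypothetical.

```python
def codon_index(nt_pos: int) -> int:
    """Return the 1-based codon number that contains a 1-based nucleotide position."""
    return (nt_pos - 1) // 3 + 1


def codon_span(nt_pos: int) -> tuple:
    """Return the first and last nucleotide positions of that codon."""
    idx = codon_index(nt_pos)
    return (3 * idx - 2, 3 * idx)


SIGNAL_PEPTIDE_LEN = 23  # assumption: same signal peptide length as SEQ ID NO:5

# Excerpt of the standard genetic code: only the two codons needed here.
CODON_TABLE = {"AAC": "Asn", "ACC": "Thr"}

idx = codon_index(644)                  # codon number within the open reading frame
span = codon_span(644)                  # nucleotides covered by that codon
mature_pos = idx - SIGNAL_PEPTIDE_LEN   # position in mature-peptide numbering

print(idx, span, mature_pos, CODON_TABLE["AAC"], "->", CODON_TABLE["ACC"])
```

Under these assumptions position 644 is the middle base of the codon at nucleotides 643-645, which is codon 215 of the open reading frame and residue 192 of the mature peptide.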

10. To verify the correctness of the nucleotide and amino acid
sequences identified on the original printouts generated by the
nucleotide sequencing machine, which were the bases for the
disclosure of SEQ ID NOs:1, 3, and 7 in the '667 application, I
obtained a sample of Gluconobacter oxydans strain DSM 4025, the
same microorganism from which the nucleotide sequences of SEQ
ID NOs:1 and 3 were isolated (as disclosed in the '667 application),
from the International Depository Authority, Deutsche Sammlung
von Mikroorganismen und Zellkulturen GmbH ("DSMZ"), a publicly
available cell depository.

11. With the assistance of Mr. Naoki Itoh, NRKK's Patent & Licensing
Manager, I then contracted with an independent nucleotide
sequencing company (Sawady; see the First Declaration) to use
the Gluconobacter oxydans DSM 4025 cell sample I obtained from
DSMZ to clone and sequence the relevant parts of the chromosomal
DNA of these cells.

12. The chain of custody of the cell sample and chromosomal DNA
derived therefrom is set forth in my First Declaration and in the
DECLARATION OF MR. MASAO MASHITA UNDER 37 C.F.R.
§ 1.132 and the DECLARATION OF MR. YOSHITAKA MURATA
UNDER 37 C.F.R. § 1.132, both of which are being filed concurrently
herewith.

13. With respect to the sequence work, I instructed Sawady to utilize
two primer pairs designed by the coinventors for the cloning (by
polymerase chain reaction (PCR)) and sequencing of Enzyme A
(Primers for Analysis 1) and of Enzyme A" (Primers for Analysis 2)
as described below.

Primers for Analysis 1: (for Enzyme A)
Forward: A697f    5'- tacgaagccc gttggatgac -3'
Reverse: A1000r   5'- tcgggttgat cgactgcaga -3'

Primers for Analysis 2: (for Enzyme A")
Forward: A"479f   5'- tattcgacgt cgatcgcggt -3'
Reverse: A"780r   5'- aactgctgag gtgccgtagt -3'

14. The Primers for Analysis 1 were designed to amplify (by PCR) the
region from nucleotide (nt) position 697 to nt position 1000 of the
gene encoding Enzyme A and to determine the amplified nucleotide
sequence having 304 bases including the nucleotide at position 852.
The Primers for Analysis 2 were designed to amplify (by PCR) the
region from nt position 479 to nt position 780 of the gene encoding
Enzyme A" and to determine the amplified nucleotide sequence
having 302 bases including the nucleotide at position 644.
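
The stated fragment lengths follow from inclusive position counting, as the following sketch shows. The helper names are illustrative and are not taken from the record.

```python
def amplicon_length(first_nt: int, last_nt: int) -> int:
    """Number of bases in a region, counting both end positions."""
    return last_nt - first_nt + 1


def covers(first_nt: int, last_nt: int, site: int) -> bool:
    """True if the site of interest lies inside the amplified region."""
    return first_nt <= site <= last_nt


# Primers for Analysis 1 (Enzyme A): nt 697 to nt 1000, site of interest 852
len_a = amplicon_length(697, 1000)
# Primers for Analysis 2 (Enzyme A"): nt 479 to nt 780, site of interest 644
len_a2 = amplicon_length(479, 780)

print(len_a, covers(697, 1000, 852))
print(len_a2, covers(479, 780, 644))
```

Each region spans the position in question with more than 100 bases on either side, so the site does not fall within a primer-binding region.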

15. The primer information was provided to Mr. Masao Mashita at
Sawady together with a sample of the original microorganism DSM
4025 disclosed in the '667 application (and obtained through DSMZ)
to facilitate the cloning and sequencing of the relevant nucleotides
for Enzyme A (SEQ ID NO:1) and Enzyme A" (SEQ ID NO:3). (See
my First Declaration).

16. On October 13, 2000, I received from Sawady, via Mr. Itoh, an
Experimental Report (non-finalized) including the sequence data,
which are set forth in Exhibit 5. From the anti-parallel alignment of
the (+) and (-) strands in combination with the sequence information
of the primers used, I confirmed the correctness of the two
nucleotide sequences set forth below. For determining each
sequence, I took into consideration that at positions downstream of
each primer used in the PCR sequencing carried out by Sawady, the
nucleotide reading on the sense strand was not absolutely reliable,
and thus for each such region, the data from the complementary
sequence was used:

Ex. (1): A697-1000 Sawady [304 bp] (corresponds to Enzyme A, i.e.,
SEQ ID NO:1)

697
TACGAAGCCC GTTGGATGAC CGGTGCCTGG GGCCAGATCA CGTATGACCC
CGTCACCAAC CTTGTCCACT ACGGCTCGAC CGCTGTGGGT CCGGCGTCGG
AAACCCAACG CGGCACCCCG GGCGGCACGC TGTACGGCAC GAACACCCGT
852
TTCGCCGTGC GTCCTGACAC GGGCGAGATT GTCTGGCGTC ACCAGACCCT
GCCCCGCGAC AACTGGGACC AGGAATGCAC GTTCGAGATG ATGGTCACCA
ATGTGGATGT CCAACCCTCG ACCGAGATGG AAGGTCTGCA GTCGATCAAC
1000
CCGA

17. The correctness of the above-identified sequence was verified with
the two nucleotide sequences (41F903 and 39F903) (Exhibit 5) and
two primer sequences (A697f and A1000r):

* Nucleotides 697-716 of Ex(1) above are the same as nucleotides
1-20 of primer A697f;

* Nucleotides 717-966 of Ex(1) above are the same as nucleotides
41-290 of the complementary sequence of 41F903;

* Nucleotides 967-980 of Ex(1) above are the same as nucleotides
253-266 of 39F903; and

* Nucleotides 981-1000 of Ex(1) above are the same as
nucleotides 1-20 of the complementary sequence of primer
A1000r.
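
The first and last items above can be reproduced from the primer sequences given in this declaration. The sketch below is illustrative only: the 20-base strings for nt 697-716 and nt 981-1000 are copied from Ex. (1), "complementary sequence" is taken to mean the reverse complement, and the helper name is hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(COMPLEMENT)[::-1]


forward_primer = "tacgaagcccgttggatgac"   # A697f, 5'->3'
reverse_primer = "tcgggttgatcgactgcaga"   # A1000r, 5'->3'

ex1_first20 = "TACGAAGCCCGTTGGATGAC"      # nt 697-716 of Ex. (1)
ex1_last20 = "TCTGCAGTCGATCAACCCGA"       # nt 981-1000 of Ex. (1)

forward_ok = forward_primer.upper() == ex1_first20
reverse_ok = reverse_complement(reverse_primer).upper() == ex1_last20
print(forward_ok, reverse_ok)
```

The same comparison can be made for Ex. (2) by substituting the A"479f and A"780r primer sequences and the corresponding ends of that fragment.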

Ex. (2): A"479-780 Sawady [302 bp] (corresponds to Enzyme A", i.e.,
SEQ ID NO:3)

479
TATTCGACGT CGATCGCGGT CAAGGCACGG ATATGGTCTC GAACTCGTCC
GGCCCGATTG TCGCCAATGG CGTCATCGTT GCGGGCTCGA CCTGTCAGTA
TTCGCCGTTC GGCTGTTTCG TTTCGGGCCA CGACTCGGCC ACCGGTGAAG
644
AGCTGTGGCG CAACACCTTT ATCCCGCGCG CCGGCGAAGA GGGTGATGAG
ACCTGGGGCA ATGATTACGA GGCCCGCTGG ATGACCGGCG TTTGGGGCCA
GATCACCTAT GACCCCGTTG GCGGCCTTGT CCACTACGGC ACCTCAGCAG
780
TT

18. The correctness of the above-identified sequence was verified with
two nucleotide sequences (45F903 and 43F903) (Exhibit 5) and two
primer sequences (A"479f and A"780r):

* Nucleotides 479-498 of Ex(2) above are the same as nucleotides
1-20 of primer A"479f;

* Nucleotides 499-728 of Ex(2) above are the same as nucleotides
41-270 of the complementary sequence of 45F903;

* Nucleotides 729-760 of Ex(2) above are the same as nucleotides
228-259 of 43F903; and

* Nucleotides 761-780 of Ex(2) above are the same as nucleotides
1-20 of the complementary sequence of primer A"780r.
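
As a consistency check on the position ranges cited for Ex(1) and Ex(2), the sketch below confirms that the four segments of each fragment cover the fragment end to end without gaps or overlaps, and that each segment has the same length as the read or primer range it is matched to. The data structures and function names are illustrative assumptions; only the position numbers come from the text.

```python
def tiles(segments, first_nt, last_nt):
    """True if consecutive (start, end) segments cover first_nt..last_nt exactly once."""
    expected = first_nt
    for start, end in segments:
        if start != expected or end < start:
            return False
        expected = end + 1
    return expected == last_nt + 1


def same_length(target, source):
    """True if two inclusive ranges contain the same number of positions."""
    return target[1] - target[0] == source[1] - source[0]


# (range in the fragment, matching range in the primer or sequencing read)
ex1 = [((697, 716), (1, 20)), ((717, 966), (41, 290)),
       ((967, 980), (253, 266)), ((981, 1000), (1, 20))]
ex2 = [((479, 498), (1, 20)), ((499, 728), (41, 270)),
       ((729, 760), (228, 259)), ((761, 780), (1, 20))]

ex1_ok = tiles([t for t, _ in ex1], 697, 1000) and all(same_length(t, s) for t, s in ex1)
ex2_ok = tiles([t for t, _ in ex2], 479, 780) and all(same_length(t, s) for t, s in ex2)
print(ex1_ok, ex2_ok)
```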

19. Based on my knowledge and experience, and in view of the results
presented herein, it is my opinion that SEQ ID NOs:1 and 3 of the
'667 application each contain a single nucleotide error introduced by
a typing mistake. The single mistake in SEQ ID NO:1 resulted in no
error in the amino acid sequence of SEQ ID NO:5. The single
mistake in SEQ ID NO:3 when translated resulted in a single amino
acid error in SEQ ID NO:7. Each of these errors is readily
identifiable to one of skill in the art by cloning and sequencing the
chromosomal DNA of the same microorganism used in the '667
application, which is publicly available. The identification of each of
these errors is summarized in more detail below:

(a) SEQ ID NO:1

By comparing the nucleotide sequence identified above as "A697-1000
Sawady [304 bp]" with the nucleotide sequence recited in the
original nucleotide printout from the nucleotide sequencing machine
(Exhibit 3), I confirmed that the nucleotides from positions 697-1000
in each sequence are identical. Therefore, the nucleotide at
position 852 in SEQ ID NO:1 ("G") is incorrect and should read "C."
This error had no effect on the corresponding deduced amino acid
sequence in SEQ ID NO:5.

(b) SEQ ID NO:3

By comparing the nucleotide sequence identified above as "A"479-780
Sawady [302 bp]" with the nucleotide sequence recited in the
original nucleotide printout from the nucleotide sequencing machine
(Exhibit 4), I confirmed that the nucleotides from positions 479-780
in each sequence are identical. Therefore, the nucleotide at position
644 in SEQ ID NO:3 ("A") is incorrect and should read "C."

(c) SEQ ID NO:7

Based on the correct nucleotide sequence for SEQ ID NO:3, the
triplet codon recited as "AAC" of nucleotide positions 643-645 in
SEQ ID NO:3 should read "ACC." This should be reflected in the
corresponding deduced amino acid sequence (SEQ ID NO:7) at
amino acid position 192, which was recited as "Asn" in the current
Sequence Listing. The correct codon ("ACC"), however,
corresponds to the amino acid "Thr." Therefore, the amino acid at
position 192 in SEQ ID NO:7 ("Asn") is incorrect and should read
"Thr."

20. In sum, on the basis of the data presented herein, resequencing of
the relevant parts of the chromosomal DNA of a sample of the same
microorganism from which SEQ ID NOs:1 and 3 were isolated as
disclosed in the '667 application (i.e., Gluconobacter oxydans strain
DSM 4025) confirms that typographical mistakes resulted in the
following errors found in SEQ ID NOs:1, 3, and 7, and that such
errors would be readily identified by one skilled in this art using
publicly available starting materials and routine skill. Accordingly, in
my opinion, one skilled in this art would recognize, after
resequencing the relevant parts of the chromosomal DNA of DSM
4025, that:

(1) The nucleotide at position 852 of SEQ ID NO:1, which
currently recites "G," should recite "C."

(2) The nucleotide at position 644 of SEQ ID NO:3, which
currently recites "A," should recite "C."

(3) The amino acid at position 192 of SEQ ID NO:7, which
currently recites "Asn," should recite "Thr."
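
The kind of position-by-position comparison that identifies these discrepancies can be sketched as follows. The 20-base and 10-base windows are copied from the as-filed Sequence Listing and from the resequenced fragments reproduced in this declaration; the function name and the choice of windows are illustrative assumptions, not the procedure actually used.

```python
def mismatches(as_filed: str, resequenced: str, first_nt: int):
    """List (position, as-filed base, resequenced base) wherever two aligned windows differ."""
    if len(as_filed) != len(resequenced):
        raise ValueError("windows must have equal length")
    return [
        (first_nt + offset, a, b)
        for offset, (a, b) in enumerate(zip(as_filed, resequenced))
        if a != b
    ]


# Enzyme A, nt 841-860: SEQ ID NO:1 as filed versus Ex. (1)
seq1_diff = mismatches("ACCCGTTTCGCGGTGCGTCC", "ACCCGTTTCGCCGTGCGTCC", 841)

# Enzyme A", nt 641-650: SEQ ID NO:3 as filed versus Ex. (2)
seq3_diff = mismatches("ACAACTTTAT", "ACACCTTTAT", 641)

print(seq1_diff)
print(seq3_diff)
```

Within these windows the only differences fall at positions 852 and 644, matching items (1) and (2) above.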

I declare further that all statements made herein of my own knowledge are
true and that all statements made on information and belief are believed to be true and
further that these statements were made with the knowledge that willful false statements
and the like so made are punishable by fine or imprisonment, or both, under Section
1001 of Title 18 of the United States Code and that such willful false statements may
jeopardize the validity of the application or any patent issuing thereon.

Dated: [handwritten date illegible]

Masako Shinjoh

CURRICULUM VITAE of Masako Shinjoh

As of August 28, 2002

Scientist
Department of Applied Microbiology
Nippon Roche K.K.
Nippon Roche Research Center
200 Kajiwara, Kamakura, Japan
247-8530

Phone: +81-467-47-2226
FAX: +81-467-45-6812
E-mail: masako.shinjoh@roche.com

Education & Research Experience:

1a. Scientist (April 1979 to date) at Dept. of Applied Microbiology, Nippon Roche K.K.,
Nippon Roche Research Center, at Kamakura, Japan, which belongs to Vitamin and Fine
Chemical Division in Hoffmann-La Roche.

This work includes improvement of microorganisms producing vitamin or its precursor by
conventional method and genetic engineering.

1b. Visiting scientist (Jan. to March 1982) at Research Institute of Molecular Biology at
Nutley, NJ, USA, which belonged to Hoffmann-La Roche. Objectives: to exchange
scientific information and technical transfer of genetic engineering skills.

2. Ph.D. (Jan. 12, 1996)

Ph.D. in Engineering from Department of Fermentation Technology, Osaka University,
Osaka, Japan.

The title of the thesis is "Metabolic engineering for 2-keto-L-gulonic acid production in
Gluconobacter".

3. Master Degree (April 1977 to March 1979)

Master in Engineering from Department of Fermentation Technology, Osaka University,
Osaka, Japan.

The projects involved were
"Characterization of bacteriophage of bacitracin-producing Bacillus"
"Application of plasmid on fermentation production: factors responsible for stabilization of
hybrid plasmids carrying tryptophan operon in E. coli"

4. Bachelor Degree (April 1975 to March 1977)

Department of Fermentation Technology, Osaka University, Osaka, Japan.
The projects involved were
"In vitro synthesis of alpha-amylase of Bacillus"

5. Professional field

Microbiology
Fermentation technology
Genetic engineering

6. Memberships

a) The Society for Bioscience and Bioengineering

b) Japan Society for Bioscience, Biotechnology, and Agrochemistry

7. Personal information:
Female,
Japanese citizen.
Birthday: 20th February, 1955



LIST OF PUBLICATIONS

Original Papers by the Author

Shinjoh, M., Y. Setoguchi, T. Hoshino and A. Fujiwara. (1990)
L-Sorbose dissimilation in 2-keto-L-gulonic acid-producing mutant UV10 derived from
Gluconobacter melanogenus IFO 3293. Agric. Biol. Chem. 54: 2257 - 2263.

Shinjoh, M., T. Sugisawa, S. Masuda, and T. Hoshino. (1994)
Efficient conversion of L-sorbosone to 2-keto-L-gulonic acid by Acetobacter liquefaciens
strains. J. Ferment. Bioeng. 78: 476 - 478.

Shinjoh, M., and T. Hoshino. (1995) Development of a stable shuttle vector and a
conjugative transfer system for Gluconobacter oxydans. J. Ferment. Bioeng. 79: 95 - 99.

Shinjoh, M., N. Tomiyama, A. Asakura, and T. Hoshino. (1995) Cloning and nucleotide
sequencing of membrane-bound L-sorbosone dehydrogenase gene of Acetobacter
liquefaciens IFO 12258 and its expression in Gluconobacter oxydans. Appl. Environ.
Microbiol. 43: 1064 - 1069.

Shinjoh, M., M. Tazoe, and T. Hoshino. (2002) NADPH-dependent L-sorbose reductase is
responsible for L-sorbose assimilation in Gluconobacter suboxydans IFO 3291. J.
Bacteriol. 184: 861 - 863.

Miyazaki, T., N. Tomiyama, M. Shinjoh, and T. Hoshino. (2002) Molecular cloning and
functional expression of D-sorbitol dehydrogenase from Gluconobacter suboxydans
IFO 3255 which requires PQQ and hydrophobic protein SldB for the activity development in
E. coli. (2001) Biosci. Biotechnol. Biochem. 66: 262 - 270. (the corresponding author)

Shinjoh, M., N. Tomiyama, T. Miyazaki, and T. Hoshino. (2002) Main polyol
dehydrogenase of Gluconobacter suboxydans IFO 3255, membrane-bound D-sorbitol
dehydrogenase, that needs product of upstream gene, sldB, for activity. Biosci. Biotechnol.
Biochem. (in press)

Sugisawa, T., T. Hoshino, S. Masuda, S. Nomura, Y. Setoguchi, M. Tazoe, M. Shinjoh, S.
Someha and A. Fujiwara. (1990) Microbial production of 2-keto-L-gulonic acid from
L-sorbose and D-sorbitol by Gluconobacter oxydans. Agric. Biol. Chem. 54: 1201 - 1209.

Other Publications on the work done at Osaka Univ.

Imanaka, T., K. Uchida, M. Tateishi (Shinjoh), and S. Aiba. (1979)
Inducible bacteriophage of Bacillus licheniformis ATCC 10716. Virology 95: 249 - 250.

Tsunekawa, H., M. Tateishi (Shinjoh), T. Imanaka, S. Aiba. (1981) TnA-directed deletion
of the trp operon from RSF2124-trp in Escherichia coli.

Patent publication: USP granted including "M. Shinjoh" as the inventor
(as of Aug. 28, 2002)

     PAT. NO.    Title
1    6,407,203   Cytochrome c and polynucleotides encoding cytochrome c
2    6,146,860   Manufacture of L-ascorbic acid and D-erythorbic acid
3    6,127,156   D-sorbitol dehydrogenase gene
4    6,037,147   Cytochrome c and polynucleotides encoding cytochrome c
5    5,541,108   Gluconobacter oxydans strains
6    5,399,496   DNA shuttle vectors for E. coli, Gluconobacter, and Acetobacter
7    5,352,599   Co-enzyme-independent L-sorbosone dehydrogenase of Gluconobacter
                 oxydans: isolation, characterization, and cloning and autologous expression
                 of the gene

SEQUENCE LISTING

(1) GENERAL INFORMATION

(i) APPLICANT

NAME: F. HOFFMANN-LA ROCHE AG
STREET: Grenzacherstrasse 124
CITY: Basle
COUNTRY: Switzerland
POSTAL CODE: CH-4002
TELEPHONE: 061-688 2511
FAX: 061-688 13 95
TELEX: 962292/965542 hlr c

(ii) TITLE OF INVENTION:

Alcohol/Aldehyde dehydrogenase genes

(iii) NUMBER OF SEQUENCES: 8

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: Macintosh

(C) OPERATING SYSTEM:

(D) SOFTWARE: MS word ver 5.1

(v) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION

CCAAGGCGAA GACATGGTTT CGAACTCGTC GGGCCCGATC GTGGCAAACG 550
GCGTGATCGT TGCCGGTTCG ACCTGCCAAT ACTCGCCGTT CGGCTGCTTT 600
GTCTCGGGCC ACGACTCGGC CACCGGTGAA GAGCTGTGGC GCAACTACTT 650
CATCCCGCGC GCTGGCGAAG AGGGTGATGA GACTTGGGGC AACGATTACG 700
AAGCCCGTTG GATGACCGGT GCCTGGGGCC AGATCACCTA TGACCCCGTC 750
ACCAACCTTG TCCAGTACGG CTCGACCGCT GTGGGTCCGG CGTCGGAAAC 800
CCAACGCGGC ACCCCGGGCG GCACGCTGTA CGGCACGAAC ACCCGTTTCG 850
CGGTGCGTCC TGACACGGGC GAGATTGTCT GGCGTCACCA GACCCTGCCC 900
CGCGACAACT GGGACCAGGA ATGCACGTTC GAGATGATGG TCACCAATGT 950
GGATGTCCAA CCCTCGACCG AGATGGAAGG TCTGCAGTCG ATCAACCCGA 1000
ACGCCGCAAC TGGCGAGCGT CGCGTGCTGA CCGGCGTTCC GTGCAAAACC 1050
GGCACCATGT GGCAGTTCGA CGCCGAAACC GGCGAATTCC TGTGGGCCCG 1100
TGATACCAAC TACCAGAACA TGATCGAATC CATCGACGAA AACGGCATCG 1150
TGACCGTGAA CGAAGATGCG ATCCTGAAGG AACTGGATGT TGAATATGAC 1200
GTCTGCCCGA CCTTCTTGGG CGGCCGCGAC TGGCCGTCGG CCGCACTGAA 1250
CCCCGACAGC GGCATCTACT TCATCCCGCT GAACAACGTC TGCTATGACA 1300
TGATGGCCGT CGATCAGGAA TTCACCTCGA TGGACGTCTA TAACACCAGC 1350
AACGTGACCA AGCTGCCGCC CGGCAAGGAT ATGATCGGTC GTATTGACGC 1400
GATCGACATC AGCACGGGTC GTACGCTGTG GTCGGTCGAA CGTGCTGCGG 1450
CGAACTATTC GCCCGTCTTG TCGACCGGCG GCGGCGTTCT GTTCAACGGT 1500
GGTACGGATC GTTACTTCCG CGCCCTCAGC CAAGAAACCG GCGAGACCCT 1550
GTGGCAGACC CGCCTTGCAA CCGTCGCGTC GGGCCAGGCC ATCTCTTACG 1600
AGGTTGACGG CATGCAATAT GTCGCCATCG CAGGTGGTGG TGTCAGCTAT 1650
GGCTCGGGCC TGAACTCGGC ACTGGCTGGC GAGCGAGTCG ACTCGACCGC 1700
CATCGGTAAC GCCGTCTACG TCTTCGCCCT GCCGCAATAA 1740

INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1740 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) ORIGINAL SOURCE:

ORGANISM: Gluconobacter oxydans
STRAIN: DSM4025

(iv) FEATURE:

FEATURE KEY: CDS
POSITION: 1..1737
SEQUENCING METHOD: E

ATGAAGACGT CGTCTTTGGT GGTTGCGAGC GTTGCCGCGC TTGCAAGCTA 50
TAGCTCCTTT GCGCTTGCTC AAGTGACCCC CGTCACCGAT GAATTGCTGG 100
CGAACCCGCC CGCTGGTGAA TGGATCAGCT ACGGTCAGAA CCAAGAAAAC 150
TACCGTCACT CGCCCCTGAC GCAGATCACG ACTGAGAACG TCGGCCAACT 200
GCAACTGGTC TGGGCGCGCG GCATGCAGCC GGGCAAAGTC CAAGTCACGC 250
CCCTGATCCA TGACGGCGTC ATGTATCTGG CAAACCCGGG CGACGTGATC 300
CAGGCCATCG ACGCCAAAAC TGGCGATCTG ATCTGGGAAC ACCGCCGCCA 350
ACTGCCGAAC ATCGCCACGC TGAACAGCTT TGGCGAGCCG ACCCGCGGCA 400
TGGCGCTGTA CGGCACdAAC GTTTACTTTG TTTCGTGGGA CAACCACCTG 450
GTCGCCCTCG ACACCGCAAC TGGCCAAGTG ACGT^rCGACG TCGACCGCGG 500
CCAAGGCGAA GACATGGTTT CGAACTCGTC GGGCCCGATC GTGGCAAACG 550
GCGTGATCGT TGCCGGTTCG ACCTGCCAAT ACTCGCGGTT CGGCTGCTTT 600
GTCTCGGGCC ACGACTCGGC CACCGGTGAA GAGCTGTGGC GCAACTACTT 650
CATCCCGCGC GCTGGCGAAG AGGGTGATGA GACTTGGGGC AACGATTACG 700
AAGCCCGTTG GATGACCGGC GTCTGGGGTC AGATCACCTA TGACCCCGTT 750
GGCGGCCTTG TCCACTACGG CTCGTCGGCT GTTGGCCCGG CTTCGGAAAC 800
C C- ACjCGCGGC ACCACCGGCG GCACCATGTA CGGCACCAAC ACCCGTTTCG 850
CTGTCCGTCC CGAGACTGGC GAGATCGTCi* GGCGTCACCA AACTCTGCCC 900
CGdGACAACT GGGACCAAGA GTGCACCTTC GAGATGATGG TTGCCAACGT 950
TGACGTGCAG CCCGCAGCTG ACATGGACGG CGTCCGCTCG ATCAACCCGA 1000
ACGCCGCCAC CGGCGAGCGT CGCGTTCTGA CCGGCGTTCC GTGCAAAACC 1050
GGCACCATGT GGCAGTTCGA CGCCGAAACC GGCGAATTCC TGTGGGCCCG 1100
TGAGACCAGC TACGAGAACA TCATCGAATC GATCGACGAA AACGGCATCG 1150
TGACCGTCGA CGAGTCGAAA GTTCTGACCG AGCTGGACAC CCCCTATGAC 1200
GTCTGCGCGC TGCTGCTGGG TGGCCGTGAC TGGCCGTCGG CTGCGCTGAA 1250
CCCCGATACC GGCATCTACT TTATCCCGCT GAACAACACC TGCATGGATA 1300
TCGAAGCTGT CGACCAGGAA TTCAGCTCGC TGGACGTGTA CAACCAAAGC 1350
CTGACCGCCA AAATGGCACC GGGTAAAGAG CTGGTTGGCC GTATCGACGC 1400
CATCGACATC AGCACAGGCC GCACCCTGTG GACCGCTGAG CGCGAAGCCT 1450
CGAACTACGC GCCTGTCCTG TCGACCGCTG GCGGCGTTCT GTTCAACGGC 1500
GGCACCGACC GTTACTTCCG CGCTCTCAGC CAAGAGACCG GCGAGACCCT 1550
GTGGCAGACC CGTCTGGCGA CTGTCGCTTC GGGCCAAGCT GTCTCGTACG 1600
AGATCGACGG CGTCCAATAC ATCGCCATCG GCGGCGGCGC CACGACCTAT 1650
GGTTCGTTCC ACAACCGTCC CCTGGCCGAG CCGGTCGACT CGACCGCGAT 1700
CGGTAATGCG ATGTACGTCT TCGCGCTGCC CCAGCAATAA 1740

INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1737 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) ORIGINAL SOURCE:

ORGANISM: Gluconobacter oxydans
STRAIN: DSM4025

(iv) FEATURE:

FEATURE KEY: CDS
POSITION: 1..1734
SEQUENCING METHOD: E

ATGAAACTGA CGACCCTGCT GCAAAGCAGC GCCGCCCTGC TTGTGCTTGG 50
CACCATTCCC GCCCTTGCCC AAACCGCCAT CACCGATGAA ATGCTGGCGA 100
ACGCGCCCGC TGGTGAATGG ATCAACTACG GTCAGAACCA AGAGAACTAC 150
CGCCACTCGC CCCTGACGCA GATTACCGCA GACAACGTCG GCCAACTGCA 200
ACTGGTCTGG GCGCGCGGTA TGGAAGCGGG CAAGATCCAA GTGACCCCGC 250
TTGTCCATGA CGGCGTCATG TATCTGGCAA ACCCCGGTGA CGTGATCCAG 300
GCCATCGACG CCGCGACCGG CGATCTGATC TGGGAACACC GCCGCCAACT 350
GCCGAACATC GCCACGCTGA ACAGCTTTGG TGAGCCGACC CGCGiGCATGd 400
CCCTCTATGG CACCAACGTC TATTTCGTCT CGTGGGACAA CCACTTGGTC 450
GCGCTGGACA CCTCGACCGG CCAAGTCGTA TTCGACGTCG ATCGCGGTCA 500
AGGCACGGAT ATGGTCTCGA ACTCGTCCGG CCCGATTGTC GCCAATGGCG 550
TCATCGTTGC GGGCTCGACC TGTCAGTATT CGCCGTTCGG CTGTTTCGTT 600
TCGGGCCACG ACTCGGCCAC CGGTGAAGAG CTGTGGCGCA ACAACTTTAT 650
CCCGCGCGCC GGCGAAGAGG GTGATGAGAC CTGGGGCAAT GATTACGAGG 700
CCCGCTGGAT GACCGGCGTT TGGGGCCAGA TCACCTATGA CCCCGTTGGC 750
GGCCTTGTCC ACTACGGCAC CTCAGCAGTT GGCCCTGCGG CCGAGATTCA 800
GCGCGGCACC GTTGGCGGCT CGATGTATGG CACCAACACC CGCTTTGCTG 850
TCCGCCCCGA GACGGGCGAG ATCGTCTGGC GTCACCAAAC TCTGCCCCGC 900
GACAACTGGG ACCAAGAGTG TACGTTCGAG ATGATGGTCG TCAACGTCGA 950
CGTCCAGCCC TCGGCTGAGA TGGAAGGCCT GCACGCCATC AACCCCGATG 1000
CCGCCACGGG CGAGCGTCGC GTTGTGACCG GCGTTCCGTG CAAGAACGGC 1050
ACCATGTGGC AGTTCGACGC CGAAACCGGC GAATTCCTGT GGGCGCGCGA 1100
CACCAGCTAT CAGAACCTGA TCGAAAGCGT CGATCCCGAT GGTCTGGTGC 1150
ATGTGAACGA AGATCTGGTC GTGACCGAGC TGGAAGTGGC CTATGAAATC 1200
TGCCCGACCT TCCTGGGTGG CCGCGACTGG CCGTCGGCTG CGCTGAACCC 1250
CGATACTGGC ATCTATTTCA TCCCGCTGAA CAACGCCTGT AGCGGTATGA 1300
CGGCTGTCGA CCAAGAGTTC AGCTCGCTCG ATGTGTATAA CGTCAGCCTC 1350
GACTATAAAC TGTCGCCCGG TTCGGAAAAC ATGGGCCGTA TCGACGGCAT 1400
CGACATCAGC ACCGGCCGCA CGCTGTGGTC GGCTGAACGC TACGCCTCGA 1450
ACTACGqGCC TGTCCTGTCC ACCGGCGGCG GCGTGCTGTT CAACGGCGGC 1500
ACCGACCGTT ACTTCCGCGC CCTCAGCCAA GAGACCGGCG AGACGCTGTG 1550
GCAGACCCGT CTGGCGACTG TCGCCTCGGG TCAAGCGATT TCCTATGAGA 1600
TCGACGGCGT GCAATATGTC GCCATCGGGC GCGGCGGCAC CAGCTATGGC 1650
AGCAACCACA ACCGCGCCCT GACCGAGCGG ATCGACTCGA CCGCCATCGG 1700
CAGCGCGATC TATGTCTTTG CTCTGCCGCA GCAGTAA 1737

INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1740 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) ORIGINAL SOURCE:

ORGANISM: Gluconobacter oxydans
STRAIN: DSM4025

(iv) FEATURE:

FEATURE KEY: CDS
POSITION: 1..1737
SEQUENCING METHOD: E

ATGAACCCCA CAACGCTGCT TCGCACCAGC GCGGCCGTGC TATTGCTTAC 50
CGCGCCCGCC GCATTCGCGC AGGTAACCCC GATTACCGAT GAACTGCTGG 100
CGAACCCGCC CGCTGGTGAA TGGATTAACT ACGGCCGCAA CCAAGAAAAC 150
TATCGCCACT CGCCCCTGAC CCAGATCACT GCCGACAACG TTGGTCAGTT 200
GCAACTGGTC TGGGCCCGCG GGATGGAGGC GGGGGCCGTA CAGGTCACGC 250
CGATGATCCA TGATGGCGTG ATGTATCTGG CAAACCCCGG TGATGTGATC 300
CAGGCGCTGG ATGCGCAAAC AGGCGATCTG ATCTGGGAAC ACCGCCGCCA 350
ACTGCCCGGC GTCGCCACGC TAAACGCCCA AGGCGACCGC AAGCGCGGCG 400
TCGCCCTTTA CGGCACGAGC CTCTATTTCA GCTCATGGGA CAACCATCTG 450
ATCGCGCTGG ATATGGAGAC GGGCCAGGTC GTATTCGATG TCGAACGTGG 500
ATCGGGCGAA GACGGCTTGA CCAGTAACAC CACGGGGCCG ATTGTCGCCA 550
ATGGCGTCAT CGTCGCGGGT TCCACCTGCC AATATTCGCC CTATGGATGC 600
TTTATCTCGG GGCACGATTC CGCGACGGGT GAGGAGCTGT GGCGCAACCA 650
CTTTATCCCG CAGCCGGGCG AAGAGGGTGA CGAGACTTGG GGCAATGATT 700
TCGAGGCGCG CTGGATGACC GGCGTCTGGG GTCAGATCAC CTATGATCCC 750
GTGACGAACC TTGTGTTCTA TGGCTCGACC GGCGTGGGCC CAGCGTCCGA 800
AACCCAGCGC GGCACGCCGG GCGGCACGCT GTATGGCACC AACACCCGCT 850
TTGCGGTGCG TCCCGACACG GGCGAGATTG TCTGGCGTCA CCAGACCCTG 900
CCGCGCGACA ACTGGGACCA AGAATGCACG TTCGAGATGA TGGTCGCCAA 950
CGTCGATGTG CAACCCTCGG CCGAGATGGA GGGTCTGCGC GCCATCAACC 1000
CCAATGCGGC GACGGGCGAG CGCCGTGTGC TGACGGGTGC GCCTTGCAAG 1050
ACCGGCACGA TGTGGTCGTT TGATGCGGCC TCGGGCGAAT TCCTGTGGGC 1100
GCGTGATACC AACTACACCA ATATGATCGC CTCGATCGAC GAGACCGGCC 1150
TTGTGACGGT GAACGAGGAT GCGGTGCTGA AAGAGCTGGA CGTTGAATAT 1200
GACGTCTGCC CGACCTTCCT GGGTGGGCGC GACTGGTCGT CAGCCGCACT 1250
GAACCCGGAC ACCGGCATTT ACTTCTTGCC GCTGAACAAT GCCTGCTACG 1300
ATATTATGGC CGTTGATCAA GAGTTTAGCG GGCTCGACGT CTATAACACC 1350
AGCGCGACCG CAAAACTCGG GCCGGGCTTT GAAAATATGG GCCGCATCGA 1400
CGCGATTGAT ATCAGCACCG GGCGCACCTT [illegible] GAGCGCCCTG 1450
CGGCGAACTA CTCGCCCGTT TTGTCGACGG CAGGCGGTGT GGTGTTCAAC 1500
GGCGGGACCG AGCGCTATTT CCGTGCCCTC AGCCAGGAAA CCGGCGAGAC 1550
TTTGTGGCAG GCCCGTCTTG CGACGGTCGC GACGGGGCAG GCGATCAGCT 1600
ACGAGTTGGA CGGCGTGCAA TATATCGCCA TCGGTGCGGG CGGTCTGACC 1650
TATGGCACGC AATTGAACGC GCCGCTGGCC GAGGCAATCG ATTCGACCTC 1700
GGTCGGTAAT GCGATCTATG TCTTTGCACT GCCGCAGTAA 1740



INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 residues

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) ORIGINAL SOURCE:

ORGANISM: Gluconobacter oxydans
STRAIN: DSM4025

(iv) FEATURE:

FEATURE KEY: sig peptide
POSITION: -23..-1
SEQUENCING METHOD: E
FEATURE KEY: mat peptide
POSITION: 1..556
SEQUENCING METHOD: E



Met Lys Pro Thr Ser Leu Leu Trp Ala Ser Ala Gly Ala Leu Ala
-20 -15 -10

Leu Leu Ala Ala Pro Ala Phe Ala Gln Val Thr Pro Val Thr Asp
-5 1 5

Glu Leu Leu Ala Asn Pro Pro Ala Gly Glu Trp Ile Ser Tyr Gly
10 15 20

Gln Asn Gln Glu Asn Tyr Arg His Ser Pro Leu Thr Gln Ile Thr
25 30 35

Thr Glu Asn Val Gly Gln Leu Gln Leu Val Trp Ala Arg Gly Met
40 45 50

Gln Pro Gly Lys Val Gln Val Thr Pro Leu Ile His Asp Gly Val
55 60 65

Met Tyr Leu Ala Asn Pro Gly Asp Val Ile Gln Ala Ile Asp Ala
70 75 80

Lys Thr Gly Asp Leu Ile Trp Glu His Arg Arg Gln Leu Pro Asn
85 90 95

Ile Ala Thr Leu Asn Ser Phe Gly Glu Pro Thr Arg Gly Met Ala
100 105 110

Leu Tyr Gly Thr Asn Val Tyr Phe Val Ser Trp Asp Asn His Leu
115 120 125

Val Ala Leu Asp Thr Ala Thr Gly Gln Val Thr Phe Asp Val Asp
130 135 140

Arg Gly Gln Gly Glu Asp Met Val Ser Asn Ser Ser Gly Pro Ile
145 150 155

Val Ala Asn Gly Val Ile Val Ala Gly Ser Thr Cys Gln Tyr Ser
160 165 170

Pro Phe Gly Cys Phe Val Ser Gly His Asp Ser Ala Thr Gly Glu
175 180 185

Glu Leu Trp Arg Asn Tyr Phe Ile Pro Arg Ala Gly Glu Glu Gly
190 195 200

Asp Glu Thr Trp Gly Asn Asp Tyr Glu Ala Arg Trp Met Thr Gly
205 210 215

Ala Trp Gly Gln Ile Thr Tyr Asp Pro Val Thr Asn Leu Val His
220 225 230

Tyr Gly Ser Thr Ala Val Gly Pro Ala Ser Glu Thr Gln Arg Gly
235 240 245

Thr Pro Gly Gly Thr Leu Tyr Gly Thr Asn Thr Arg Phe Ala Val
250 255 260

Arg Pro Asp Thr Gly Glu Ile Val Trp Arg His Gln Thr Leu Pro
265 270 275

Arg Asp Asn Trp Asp Gln Glu Cys Thr Phe Glu Met Met Val Thr
280 285 290

Asn Val Asp Val Gln Pro Ser Thr Glu Met Glu Gly Leu Gln Ser
295 300 305

Ile Asn Pro Asn Ala Ala Thr Gly Glu Arg Arg Val Leu Thr Gly
310 315 320

Val Pro Cys Lys Thr Gly Thr Met Trp Gln Phe Asp Ala Glu Thr
325 330 335

Gly Glu Phe Leu Trp Ala Arg Asp Thr Asn Tyr Gln Asn Met Ile
340 345 350

Glu Ser Ile Asp Glu Asn Gly Ile Val Thr Val Asn Glu Asp Ala
355 360 365

Ile Leu Lys Glu Leu Asp Val Glu Tyr Asp Val Cys Pro Thr Phe
370 375 380

Leu Gly Gly Arg Asp Trp Pro Ser Ala Ala Leu Asn Pro Asp Ser
385 390 395

Gly Ile Tyr Phe Ile Pro Leu Asn Asn Val Cys Tyr Asp Met Met
400 405 410

Ala Val Asp Gln Glu Phe Thr Ser Met Asp Val Tyr Asn Thr Ser
415 420 425

Asn Val Thr Lys Leu Pro Pro Gly Lys Asp Met Ile Gly Arg Ile
430 435 440

Asp Ala Ile Asp Ile Ser Thr Gly Arg Thr Leu Trp Ser Val Glu
445 450 455

Arg Ala Ala Ala Asn Tyr Ser Pro Val Leu Ser Thr Gly Gly Gly
460 465 470

Val Leu Phe Asn Gly Gly Thr Asp Arg Tyr Phe Arg Ala Leu Ser
475 480 485

Gln Glu Thr Gly Glu Thr Leu Trp Gln Thr Arg Leu Ala Thr Val
490 495 500

Ala Ser Gly Gln Ala Ile Ser Tyr Glu Val Asp Gly Met Gln Tyr
505 510 515

Val Ala Ile Ala Gly Gly Gly Val Ser Tyr Gly Ser Gly Leu Asn
520 525 530

Ser Ala Leu Ala Gly Glu Arg Val Asp Ser Thr Ala Ile Gly Asn
535 540 545

Ala Val Tyr Val Phe Ala Leu Pro Gln
550 555



INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 residues

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) ORIGINAL SOURCE:

ORGANISM: Gluconobacter oxydans
STRAIN: DSM4025

(iv) FEATURE:

FEATURE KEY: sig peptide
POSITION: -23..-1
SEQUENCING METHOD: S
FEATURE KEY: mat peptide
POSITION: 1..556
SEQUENCING METHOD: S



Met Lys Thr Ser Ser Leu Leu Val Ala Ser Val Ala Ala Leu Ala
-20 -15 -10

Ser Tyr Ser Ser Phe Ala Leu Ala Gln Val Thr Pro Val Thr Asp
-5 1 5

Glu Leu Leu Ala Asn Pro Pro Ala Gly Glu Trp Ile Ser Tyr Gly
10 15 20

Gln Asn Gln Glu Asn Tyr Arg His Ser Pro Leu Thr Gln Ile Thr
25 30 35

Thr Glu Asn Val Gly Gln Leu Gln Leu Val Trp Ala Arg Gly Met
40 45 50

Gln Pro Gly Lys Val Gln Val Thr Pro Leu Ile His Asp Gly Val
55 60 65

Met Tyr Leu Ala Asn Pro Gly Asp Val Ile Gln Ala Ile Asp Ala
70 75 80

Lys Thr Gly Asp Leu Ile Trp Glu His Arg Arg Gln Leu Pro Asn
85 90 95

Ile Ala Thr Leu Asn Ser Phe Gly Glu Pro Thr Arg Gly Met Ala
100 105 110

Leu Tyr Gly Thr Asn Val Tyr Phe Val Ser Trp Asp Asn His Leu
115 120 125

Val Ala Leu Asp Thr Ala Thr Gly Gln Val Thr Phe Asp Val Asp
130 135 140

Arg Gly Gln Gly Glu Asp Met Val Ser Asn Ser Ser Gly Pro Ile
145 150 155

Val Ala Asn Gly Val Ile Val Ala Gly Ser Thr Cys Gln Tyr Ser
160 165 170

Pro Phe Gly Cys Phe Val Ser Gly His Asp Ser Ala Thr Gly Glu 
175 180 185 

Glu Leu Trp Arg Asn Tyr Phe Ile Pro Arg Ala Gly Glu Glu Gly
190 195 200

Asp Glu Thr Trp Gly Asn Asp Tyr Glu Ala Arg Trp Met Thr Gly
205 210 215

Val Trp Gly Gln Ile Thr Tyr Asp Pro Val Gly Gly Leu Val His
220 225 230

Tyr Gly Ser Ser Ala Val Gly Pro Ala Ser Glu Thr Gln Arg Gly
235 240 245

Thr Thr Gly Gly Thr Met Tyr Gly Thr Asn Thr Arg Phe Ala Val
250 255 260

Arg Pro Glu Thr Gly Glu Ile Val Trp Arg His Gln Thr Leu Pro
265 270 275

Arg Asp Asn Trp Asp Gln Glu Cys Thr Phe Glu Met Met Val Ala
280 285 290

Asn Val Asp Val Gln Pro Ala Ala Asp Met Asp Gly Val Arg Ser
295 300 305

Ile Asn Pro Asn Ala Ala Thr Gly Glu Arg Arg Val Leu Thr Gly
310 315 320

Val Pro Cys Lys Thr Gly Thr Met Trp Gln Phe Asp Ala Glu Thr
325 330 335

Gly Glu Phe Leu Trp Ala Arg Asp Thr Ser Tyr Glu Asn Ile Ile
340 345 350

Glu Ser Ile Asp Glu Asn Gly Ile Val Thr Val Asp Glu Ser Lys
355 360 365

Val Leu Thr Glu Leu Asp Thr Pro Tyr Asp Val Cys Pro Leu Leu
370 375 380

Leu Gly Gly Arg Asp Trp Pro Ser Ala Ala Leu Asn Pro Asp Thr
385 390 395

Gly Ile Tyr Phe Ile Pro Leu Asn Asn Thr Cys Met Asp Ile Glu
400 405 410

Ala Val Asp Gln Glu Phe Ser Ser Leu Asp Val Tyr Asn Gln Ser
415 420 425

Leu Thr Ala Lys Met Ala Pro Gly Lys Glu Leu Val Gly Arg Ile
430 435 440

Asp Ala Ile Asp Ile Ser Thr Gly Arg Thr Leu Trp Thr Ala Glu
445 450 455

Arg Glu Ala Ser Asn Tyr Ala Pro Val Leu Ser Thr Ala Gly Gly 
460 465 470 

Val Leu Phe Asn Gly Gly Thr Asp Arg Tyr Phe Arg Ala Leu Ser 
475 480 485 

Gln Glu Thr Gly Glu Thr Leu Trp Gln Thr Arg Leu Ala Thr Val
490 495 500

Ala Ser Gly Gln Ala Val Ser Tyr Glu Ile Asp Gly Val Gln Tyr
505 510 515

Ile Ala Ile Gly Gly Gly Gly Thr Thr Tyr Gly Ser Phe His Asn
520 525 530

Arg Pro Leu Ala Glu Pro Val Asp Ser Thr Ala Ile Gly Asn Ala
535 540 545

Met Tyr Val Phe Ala Leu Pro Gln Gln
550 555



INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 578 residues

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) ORIGINAL SOURCE:

ORGANISM: Gluconobacter oxydans
STRAIN: DSM 4025

(iv) FEATURE:

FEATURE KEY: sig peptide
POSITION: -23..-1
SEQUENCING METHOD: S
FEATURE KEY: mat peptide
POSITION: 1..555
SEQUENCING METHOD: S

Met Lys Leu Thr Thr Leu Leu Gln Ser Ser Ala Ala Leu Leu Val
-20 -15 -10

Leu Gly Thr Ile Pro Ala Leu Ala Gln Thr Ala Ile Thr Asp Glu
-5 1 5

Met Leu Ala Asn Pro Pro Ala Gly Glu Trp Ile Asn Tyr Gly Gln
10 15 20

Asn Gln Glu Asn Tyr Arg His Ser Pro Leu Thr Gln Ile Thr Ala
25 30 35

Asp Asn Val Gly Gln Leu Gln Leu Val Trp Ala Arg Gly Met Glu
40 45 50

Ala Gly Lys Ile Gln Val Thr Pro Leu Val His Asp Gly Val Met
55 60 65

Tyr Leu Ala Asn Pro Gly Asp Val Ile Gln Ala Ile Asp Ala Ala
70 75 80

Thr Gly Asp Leu Ile Trp Glu His Arg Arg Gln Leu Pro Asn Ile
85 90 95

Ala Thr Leu Asn Ser Phe Gly Glu Pro Thr Arg Gly Met Ala Leu
100 105 110

Tyr Gly Thr Asn Val Tyr Phe Val Ser Trp Asp Asn His Leu Val
115 120 125

Ala Leu Asp Thr Ser Thr Gly Gln Val Val Phe Asp Val Asp Arg
130 135 140

Gly Gln Gly Thr Asp Met Val Ser Asn Ser Ser Gly Pro Ile Val
145 150 155

Ala Asn Gly Val Ile Val Ala Gly Ser Thr Cys Gln Tyr Ser Pro
160 165 170

Phe Gly Cys Phe Val Ser Gly His Asp Ser Ala Thr Gly Glu Glu
175 180 185

Leu Trp Arg Asn Asn Phe Ile Pro Arg Ala Gly Glu Glu Gly Asp
190 195 200

Glu Thr Trp Gly Asn Asp Tyr Glu Ala Arg Trp Met Thr Gly Val 
205 210 215 

Trp Gly Gln Ile Thr Tyr Asp Pro Val Gly Gly Leu Val His Tyr
220 225 230

Gly Thr Ser Ala Val Gly Pro Ala Ala Glu Ile Gln Arg Gly Thr
235 240 245

Val Gly Gly Ser Met Tyr Gly Thr Asn Thr Arg Phe Ala Val Arg
250 255 260

Pro Glu Thr Gly Glu Ile Val Trp Arg His Gln Thr Leu Pro Arg
265 270 275

Asp Asn Trp Asp Gln Glu Cys Thr Phe Glu Met Met Val Val Asn
280 285 290

Val Asp Val Gln Pro Ser Ala Glu Met Glu Gly Leu His Ala Ile
295 300 305

Asn Pro Asp Ala Ala Thr Gly Glu Arg Arg Val Val Thr Gly Val
310 315 320

Pro Cys Lys Asn Gly Thr Met Trp Gln Phe Asp Ala Glu Thr Gly
325 330 335

Glu Phe Leu Trp Ala Arg Asp Thr Ser Tyr Gln Asn Leu Ile Glu
340 345 350

Ser Val Asp Pro Asp Gly Leu Val His Val Asn Glu Asp Leu Val 
355 360 365 

Val Thr Glu Leu Glu Val Ala Tyr Glu Ile Cys Pro Thr Phe Leu
370 375 380

Gly Gly Arg Asp Trp Pro Ser Ala Ala Leu Asn Pro Asp Thr Gly
385 390 395

Ile Tyr Phe Ile Pro Leu Asn Asn Ala Cys Ser Gly Met Thr Ala
400 405 410

Val Asp Gln Glu Phe Ser Ser Leu Asp Val Tyr Asn Val Ser Leu
415 420 425

Asp Tyr Lys Leu Ser Pro Gly Ser Glu Asn Met Gly Arg Ile Asp
430 435 440

Ala Ile Asp Ile Ser Thr Gly Arg Thr Leu Trp Ser Ala Glu Arg
445 450 455

Tyr Ala Ser Asn Tyr Ala Pro Val Leu Ser Thr Gly Gly Gly Val
460 465 470

Leu Phe Asn Gly Gly Thr Asp Arg Tyr Phe Arg Ala Leu Ser Gln
475 480 485

Glu Thr Gly Glu Thr Leu Trp Gln Thr Arg Leu Ala Thr Val Ala
490 495 500

Ser Gly Gln Ala Ile Ser Tyr Glu Ile Asp Gly Val Gln Tyr Val
505 510 515

Ala Ile Gly Arg Gly Gly Thr Ser Tyr Gly Ser Asn His Asn Arg
520 525 530

Ala Leu Thr Glu Arg Ile Asp Ser Thr Ala Ile Gly Ser Ala Ile
535 540 545

Tyr Val Phe Ala Leu Pro Gln Gln
550 555



INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 residues

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) ORIGINAL SOURCE:

ORGANISM: Gluconobacter oxydans
STRAIN: DSM4025

(iv) FEATURE:

FEATURE KEY: sig peptide
POSITION: -23..-1
SEQUENCING METHOD: E
FEATURE KEY: mat peptide
POSITION: 1..556
SEQUENCING METHOD: E



Met Asn Pro Thr Thr Leu Leu Arg Thr Ser Ala Ala Val Leu Leu 
-20 -15 -10

Leu Thr Ala Pro Ala Ala Phe Ala Gln Val Thr Pro Ile Thr Asp
-5 1 5

Glu Leu Leu Ala Asn Pro Pro Ala Gly Glu Trp Ile Asn Tyr Gly
10 15 20

Arg Asn Gln Glu Asn Tyr Arg His Ser Pro Leu Thr Gln Ile Thr
25 30 35

Ala Asp Asn Val Gly Gln Leu Gln Leu Val Trp Ala Arg Gly Met
40 45 50

Glu Ala Gly Ala Val Gln Val Thr Pro Met Ile His Asp Gly Val
55 60 65

Met Tyr Leu Ala Asn Pro Gly Asp Val Ile Gln Ala Leu Asp Ala
70 75 80

Gln Thr Gly Asp Leu Ile Trp Glu His Arg Arg Gln Leu Pro Ala
85 90 95

Val Ala Thr Leu Asn Ala Gln Gly Asp Arg Lys Arg Gly Val Ala
100 105 110

Leu Tyr Gly Thr Ser Leu Tyr Phe Ser Ser Trp Asp Asn His Leu 
115 120 125

Ile Ala Leu Asp Met Glu Thr Gly Gln Val Val Phe Asp Val Glu
130 135 140

Arg Gly Ser Gly Glu Asp Gly Leu Thr Ser Asn Thr Thr Gly Pro
145 150 155

Ile Val Ala Asn Gly Val Ile Val Ala Gly Ser Thr Cys Gln Tyr
160 165 170

Ser Pro Tyr Gly Cys Phe Ile Ser Gly His Asp Ser Ala Thr Gly
175 180 185

Glu Glu Leu Trp Arg Asn His Phe Ile Pro Gln Pro Gly Glu Glu
190 195 200

Gly Asp Glu Thr Trp Gly Asn Asp Phe Glu Ala Arg Trp Met Thr
205 210 215

Gly Val Trp Gly Gln Ile Thr Tyr Asp Pro Val Thr Asn Leu Val
220 225 230

Phe Tyr Gly Ser Thr Gly Val Gly Pro Ala Ser Glu Thr Gln Arg
235 240 245

Gly Thr Pro Gly Gly Thr Leu Tyr Gly Thr Asn Thr Arg Phe Ala
250 255 260

Val Arg Pro Asp Thr Gly Glu Ile Val Trp Arg His Gln Thr Leu
265 270 275

Pro Arg Asp Asn Trp Asp Gln Glu Cys Thr Phe Glu Met Met Val
280 285 290

Ala Asn Val Asp Val Gln Pro Ser Ala Glu Met Glu Gly Leu Arg
295 300 305

Ala Ile Asn Pro Asn Ala Ala Thr Gly Glu Arg Arg Val Leu Thr
310 315 320

Gly Ala Pro Cys Lys Thr Gly Thr Met Trp Ser Phe Asp Ala Ala
325 330 335

Ser Gly Glu Phe Leu Trp Ala Arg Asp Thr Asn Tyr Thr Asn Met
340 345 350

Ile Ala Ser Ile Asp Glu Thr Gly Leu Val Thr Val Asn Glu Asp
355 360 365

Ala Val Leu Lys Glu Leu Asp Val Glu Tyr Asp Val Cys Pro Thr 
370 375 380

Phe Leu Gly Gly Arg Asp Trp Ser Ser Ala Ala Leu Asn Pro Asp
385 390 395

Thr Gly Ile Tyr Phe Leu Pro Leu Asn Asn Ala Cys Tyr Asp Ile
400 405 410

Met Ala Val Asp Gln Glu Phe Ser Ala Leu Asp Val Tyr Asn Thr
415 420 425

Ser Ala Thr Ala Lys Leu Ala Pro Gly Phe Glu Asn Met Gly Arg
430 435 440

Ile Asp Ala Ile Asp Ile Ser Thr Gly Arg Thr Leu Trp Ser Ala
445 450 455

Glu Arg Pro Ala Ala Asn Tyr Ser Pro Val Leu Ser Thr Ala Gly
460 465 470

Gly Val Val Phe Asn Gly Gly Thr Asp Arg Tyr Phe Arg Ala Leu
475 480 485

Ser Gln Glu Thr Gly Glu Thr Leu Trp Gln Ala Arg Leu Ala Thr
490 495 500

Val Ala Thr Gly Gln Ala Ile Ser Tyr Glu Leu Asp Gly Val Gln
505 510 515

^^%^}^ ^-^^ Leu Thr Tyr Gly Thr Glii Leu 

520 525 530 

Asn Ala Pro Leu Ala Glu Ala Ile Asp Ser Thr Ser Val Gly Asn
535 540 545

Ala Ile Tyr Val Phe Ala Leu Pro Gln
550 555



INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 82 bases
(B) TYPE: nucleotide
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) ORIGINAL SOURCE: synthetic oligonucleotide

CATGAAAATA AAAACAGGTG CACGCATCCT CGCATTATCC GCATTAACGA 50
CGATGATGTT TTCCGCCTCG GCTCTCGCCC AG 82

INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 83 bases

(B) TYPE: nucleotide

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) ORIGINAL SOURCE: synthetic oligonucleotide

GTTACCTGGG CGAGAGCCGA GGCGGAAAAC ATCATCGTCG TTAATGCGGA 50 
TAATGCGAGG ATGCGTGCAC CTGTTTTTAT TTT 83



INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 residues

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) ORIGINAL SOURCE: E. coli

(iv) FEATURE:

FEATURE KEY: sig peptide
POSITION: 1..26
FEATURE METHOD: S

Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu
1 5 10 15

Thr Thr Met Met Phe Ser Ala Ser Ala Leu Ala Gln
20 25 27



INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 bases 

(B) TYPE: nucleotide 

(C) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) ORIGINAL SOURCE: synthetic oligonucleotide

GTTAGCGCGG TGGATCCCCA TTGGAGG 27



Attachment (B)

Sequences of the amplified products.
39F903 (697-1000) /A697f.Seq 

TTNCGTGCCT GGGGCCAGAT CACCTATGAC CCCGTCACCA ACCTTGTCCA
CTACGGCTCG ACCGCTGTGG GTCCGGCGTC GGAAACCCAA CGCGGCACCC
CGGGCGGCAC GCTGTACGGC ACGAACACCC GTTTCGCCGT GCGTCCTGAC
ACGGGCGAGA TTGTCTGGCG TCACCAGACC CTGCCCCGCG ACAACTGGGA
CCAGGAATGC ACGTTCGAGA TGATGGTCAC CAATGTGGAT GTCCAACCCT
CGACCGAGAT GGAAGGTCTG CAGTCGATCA ANCGAAANNN NNNNNNNNNN
NNNNN

41F903 (697-1000) /A1000r.Seq

TTCCTCTTGG TCGAGGGTTG GACATCCACA TTGGTGACCA TCATCTCGAA 
CGTGCATTCC TGGTCCCAGT TGTCGCGGGG CAGGGTCTGG TGACGCCAGA 
GAATCTCGCC CGTGTCAGGA CGCACGGCGA AACGGGTGTT CGTGCCGTAC 
AGCGTGCCGC CCGGGGTGCC GCGTTGGGTT TCCGACGCCG GACCCACAGC
GGTCGAGCCG TAGTGGACAA GGTTGGTGAC GGGGTCATAG GTGATCTGGC
CCCAGGCACC GGTCATCCAA CGGGCTTTGT AANNNNNNNN NNNNNNNNNN 
N 

43F903 (479-780) /A479f.Seq

aaagcacttt atggnctcga actctccggc ccgattgtcg ccaatggcgt 

CATCGTTGCG GGCTCGACCT GTCAGTATTC GCCGTTCGGC TGTTTCGTTT 
GGGGCCACGA CTCGGCCACC GGTGAAGAGC TGTGGCGCAA CACCTTTATC
CCGCGCGCCG GCGAAGAGGG TGATGAGACC TGGGGCAATG ATTACGAGGC 
CCGCTGGATG ACCGGCGTTT GGGGCCAGAT CACCTATGAC CCCGTTGGCG 

gccttgtcca ctacggcacc tcaagagtta anannnnnnn nnnnnnnnn 

45F903 (479-780) /A780r.Seq 

gacaaggctn ncacggngtc ataggtgatn tggccccaaa cgccggtcat
ccagcgggcc TCGTAATCAT tgccccaggt ctcatcaccc tcttcgccgg 
cgcgcgggat aaaggtgttg cgccacagct cttcaccggt ggccgagtcg 
tggcccgaaa cgaaacagcc gaacggcgaa tactgacagg tcgagcccgc 
aacgatgacg ccattggcga caatcgggcc ggacgagttc gagaccatat

CCGTGCCTTG ACCGCGATCG ACGTCCATAA ANNNNNNNNN NNNNNNNNNN